Abstract: We investigate optimal resource allocation for delay-limited
cooperative communication in time varying wireless
networks. Motivated by real-time applications that have stringent 
delay constraints, we develop a dynamic cooperation strategy 
that makes optimal use of network resources to achieve a 
target outage probability (reliability) for each user subject to 
average power constraints. Using the technique of Lyapunov 
optimization, we first present a general framework to solve this 
problem and then derive quasi-closed form solutions for several 
cooperative protocols proposed in the literature. Unlike earlier 
works, our scheme does not require prior knowledge of the 
statistical description of the packet arrival, channel state and 
node mobility processes and can be implemented in an online 
fashion. 

Index Terms: Cooperative Communication, Delay-Limited
Communication, Mobile Ad-Hoc Networks, Reliability, Resource 
Allocation, Lyapunov Optimization 

I. Introduction 

There is growing interest in the idea of utilizing cooperative
communication to improve the
performance of wireless networks with time varying channels.
The motivation comes from the work on MIMO systems [25]
which shows that employing multiple antennas on a wireless 
node can offer substantial benefits. However, this may be 
infeasible in small-sized devices due to space limitations.
Cooperative communication has been proposed as a means 
to achieve the benefits of traditional MIMO systems using 
distributed single antenna nodes. Much recent work in this 
area promises significant gains in several metrics of interest 
(such as diversity, capacity, energy
efficiency [10], [11], etc.) over conventional methods. We refer
the interested reader to a recent comprehensive survey and 
its references. 

Fig. 1. Example 2-hop network with source, destination and relays. The
time slot structures for different transmission strategies are also shown. Due
to the half-duplex constraint, cooperative protocols need to operate in two
phases. Hence, there is an inherent loss in the multiplexing gain under any
such cooperative transmission strategy over direct transmission.

The main idea behind cooperative communication can be
understood by considering a simple 2-hop network consisting
of a source s, its destination d and a set of m relay nodes
as shown in Fig. 1. Suppose s has a packet to send to d in
timeslot t. The channel gains for all links in this network are
shown in the figure. In direct communication, s uses the full
slot to transmit its packet to d over link s-d as shown in
Fig. 1(a). In conventional multi-hop relaying, s uses the first
half of the slot to transmit its packet to a particular relay node
i over link s-i as shown in Fig. 1(b). If i can successfully
decode the packet, it re-encodes and transmits it to d in the
second half of the slot over link i-d. In both scenarios,
to ensure reliable communication, the source and/or the relay 
must transmit at high power levels when the channel quality 
of any of the links involved is poor. However, note that due 
to the broadcast nature of wireless transmissions, other relay 
nodes may receive the signal from the transmission by s and 
can cooperatively relay it to d. The destination now receives 
multiple copies/signals and can use all of them jointly to 
decode the packet. Since these signals have been transmitted 
over independent paths, the probability that all of them have 
poor quality is significantly smaller. Cooperative communica- 
tion protocols take advantage of this spatial diversity gain by 
making use of multiple relays for cooperative transmissions to 
increase reliability and/or reduce energy costs. This is different 
from traditional multi-hop relaying in which only one node 
is responsible for forwarding at any time and in which the 
destination does not use multiple signals to decode a packet. 

Because of the half-duplex nature of wireless devices, a 
relay node cannot send and receive on the same channel 
simultaneously. Therefore, such cooperative communication 
protocols typically operate over a two phase slot structure as
shown in Figs. 1(c) and 1(d). In the first phase, s transmits
its packet to the set of relay nodes. In the second phase, a 
subset of these relays transmit their signals to d. Note that 
the destination may receive the source signal from the first 
phase as well. At the end of the second phase, the destination 
appropriately combines all of these received signals to decode 
the packet. The exact slot structure as well as the signals 
transmitted by the relays depend on the cooperative protocol 
being used (see footnote 1). For example, Fig. 1(c) shows the slot structure
under a cooperative scheme that transmits over orthogonal
channels. Specifically, the time slot is divided into m + 1
equal mini-slots. In phase one, the source transmits its packet
in the first mini-slot. In the second phase, the relays transmit
one after the other in their own mini-slots. Fig. 1(d) shows
the slot structure under a cooperative scheme in which the 
cooperating relays use distributed space-time codes (DSTC) 
or a beamforming technique to transmit simultaneously in the 
second phase. It should be noted that due to this half-duplex 
constraint, there is an inherent loss in the multiplexing gain 
under any such cooperative transmission strategy over direct 
transmission. Therefore, it is important to develop algorithms 
that cooperate opportunistically. 

In this work, we consider a mobile ad-hoc network with 
delay-limited traffic and cooperative communication. Many 
real-time applications (e.g., voice) have stringent delay con- 
straints and fixed rate requirements. In slow fading environ- 
ments (where decoding delay is of the order of the channel 
coherence time), it may not be possible to meet these delay 
constraints for every packet. However, these applications can 
often tolerate a certain fraction of lost packets or outages. 
A variety of techniques are used to combat fading and meet 
this target outage probability (including exploiting diversity, 
channel coding, ARQ, power control, etc.). Cooperative com- 
munication is a particularly attractive technique to improve 
reliability in such delay-limited scenarios since it can offer 
significant spatial diversity gains in addition to these tech- 
niques. 

Much prior work on cooperative communication considers 
physical layer resource allocation for a static network, partic- 
ularly in the case of a single source. Objectives such as mini- 
mizing sum power, minimizing outage probability, meeting a 
target SNR constraint, etc., are treated in this context [10],
[11]. We draw on this work in the
development of dynamic resource allocation in a stochastic 
network with fading channels, node mobility, and random 
packet arrivals, where opportunistic cooperation decisions are 
required. Dynamic cooperation was also considered in the 
prior work [18] which investigates throughput optimality and
queue stability in a multi-user network with static channels and 
randomly arriving traffic using the framework of Lyapunov 
drift. Our formulation is different and does not involve issues 
of queue stability. Rather, we consider a delay-limited scenario 
where each packet must either be transmitted in one slot, 
or dropped. This is similar to the concept of delay-limited 
capacity [19]. Also related to such scenarios is the notion
of minimum outage probability [20]. These quantities are
also investigated in the recent work [11] that considers a
3 node static network with Rayleigh fading and shows that
opportunistic cooperation significantly improves the delay-
limited capacity.

Footnote 1: We consider several protocol examples in Sec. V.

In this work, we use techniques of both Lyapunov drift and 
Lyapunov optimization [24] to develop a control algorithm that
takes dynamic decisions for each new slot. Different from most
work that applies this theory, our solution involves a 2-stage
stochastic shortest path problem due to the cooperative relay- 
ing structure. This problem is non-convex and combinatorial 
in nature and does not admit closed form solutions in general. 
However, under several important and well known classes of 
physical layer cooperation models, we develop techniques for 
reducing the problem exactly to an m-stage set of convex
programs. The convex programs themselves are shown to have 
quasi-closed form solutions and can be computed in real time 
for each slot, often involving simple water-filling strategies 
that also arise in related static optimization problems. 

II. Basic Network Model 

We consider a mobile ad-hoc network with delay-limited 
communication over time varying fading channels. The net-
work contains a set $\mathcal{N}$ of nodes, all potentially mobile. All
nodes are assumed to be within range of each other, and any 
node pair can communicate either through direct transmission 
or through a 2-phase cooperative transmission that makes use 
of other nodes as relays. The system operates in slotted time 
and the channel coefficient between nodes i and j in slot t is
denoted by $h_{ij}(t)$. We assume a block fading model [25] for
the channel coefficients so that their value remains fixed during 
a slot and changes from one slot to the other according to the 
distribution of the underlying fading and mobility processes. 

For simplicity, we assume that the set $\mathcal{N}$ contains a single
source node s and its destination node d and that all other
nodes act simply as cooperative relays. This is similar to the
single-source assumption treated in [12], [13], [14], [15], [16]
for static networks. We derive a dynamic cooperation strategy
for this single source problem in Sec. IV that optimizes a
weighted sum of reliability and power expenditure subject
to individual reliability and average power constraints at the
source and at all relays. This highlights the decisions involved
from the perspective of a source node, and these decisions and
the resulting solution structure are similar to the multi-source
scenario operating under an orthogonal medium access scheme
(such as TDMA or FDMA) studied later in Sec. VII. In the
following, we denote the set of relay nodes by $\mathcal{R}$ and the set
$\{s\} \cup \mathcal{R}$ by $\widehat{\mathcal{R}}$. All nodes $i \in \widehat{\mathcal{R}}$ have both long term average
and instantaneous peak power constraints given by $P_i^{avg}$ and
$P_i^{max}$ respectively.

We consider two models for the availability of the channel 
state information (CSI). The first is the known channels, 
unknown statistics model. Under this model, we assume that 
the channel gains between the source node and its relay set 
and destination as well as the channel gains between the 
relays and the destination are known every slot. These could 
be obtained by sending pilot signals and via feedback. This 
model has also been considered in prior works [12], [13], [14],
[15] on power allocation in static networks where, in addition
to the current channel gains, a knowledge of the distribution 
governing the fading process is assumed. In our work, under 
this known channels, unknown statistics model, we do not 
assume any knowledge of the distributions governing the 
evolution of the channel states, mobility processes, or traffic. 
Thus, our algorithm and its optimality properties hold for a 
very general class of channel and mobility models that satisfy 
certain ergodicity requirements (to be made precise later). We 
note that the channel gain could represent just the amplitude 
of the channel coefficient if an orthogonal cooperative scheme 
is being used. However, in case of cooperative schemes such 
as beamforming, this could represent the complete description 
of the fading coefficient that includes the phase information. 

The second model we consider is the unknown channels, 
known statistics model. In this case, we assume that the current 
set of potential relay nodes is known on each slot t, but 
the exact channel realizations between the source and these 
relays, and the relays and the destination, are unknown. Rather, 
we assume only that the statistics of the fading coefficients 
are known between the source and current relays, and the 
current relays and destination. However, we still do not require 
knowledge of the distributions governing the arriving traffic or 
the mobility pattern (which affects the set of relays we will 
see in future slots). This is in contrast to prior works that have 
considered resource allocation in the presence of partial CSI 
only for static networks. 

For both models, we use T(t) to represent the collection of 
all channel state information known on slot t. For the known 
channels, unknown statistics model, T(t) represents the col-
lection of channel coefficients $h_{ij}(t)$ between the source and
relays and relays and destination. For the unknown channels, 
known statistics model, T(t) represents the set of all nodes 
that are available on slot t for relaying and the distribution of 
the fading coefficients. We assume that T(t) lies in a space 
of finite but arbitrarily large size and evolves according to an 
ergodic process with a well defined steady state distribution. 
This variation in channel state information affects the reli- 
ability and power expenditure associated with the direct and 
cooperative transmission modes that are discussed in Sec. II-B.

A. Example of Channel State Information Models 

As an example of these models, suppose the nodes move in 
a cell-partitioned network according to a Markovian random 
walk (see also Fig. 2 in Sec. VIII on Simulations). Each
slot, a node may decide to stay in its current cell or move 
to an adjacent cell according to the probability distribution 
governing the random walk. Suppose that each slot, the set 
of potential relays consists only of nodes in either the same 
or an adjacent cell of the source. Suppose channel gains 
between nodes in the same cell are distributed according to 
a Rayleigh fading model with a particular mean and variance, 
while gains for nodes in adjacent cells are Rayleigh with 
a different mean and variance. Under the known channels, 
unknown statistics model, the T(t) information is the set 
of current gains $h_{ij}(t)$, and the Rayleigh distribution is not
needed. Under the unknown channels, known statistics model,
the T(t) information is the set of nodes currently in the same
and adjacent cells of the source, and we assume we know 
that the fading distribution is Rayleigh, and we know the 
corresponding means and variances. However, neither model 
requires knowledge of the mobility model or the traffic rates. 
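
The cell-partitioned example can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the text: the cells are arranged on a one-dimensional ring, the stay probability `p_stay` is a made-up parameter, and the function names `step_cell` and `potential_relays` are hypothetical.

```python
import random

def step_cell(cell, num_cells, p_stay, rng):
    """One step of a lazy random walk on a ring of cells (assumed topology)."""
    u = rng.random()
    if u < p_stay:
        return cell
    # Otherwise move to the left or right neighbor with equal probability.
    move = -1 if u < p_stay + (1.0 - p_stay) / 2.0 else 1
    return (cell + move) % num_cells

def potential_relays(source_cell, node_cells, num_cells):
    """Return the nodes located in the same or an adjacent cell of the source."""
    relays = []
    for node, cell in node_cells.items():
        d = abs(cell - source_cell)
        d = min(d, num_cells - d)  # distance measured around the ring
        if d <= 1:
            relays.append(node)
    return sorted(relays)

# Example: three relays moving for a few slots around a fixed source.
rng = random.Random(0)
num_cells = 6
source_cell = 0
node_cells = {"r1": 0, "r2": 1, "r3": 3}
for _ in range(3):
    node_cells = {n: step_cell(c, num_cells, 0.5, rng) for n, c in node_cells.items()}
    relay_set = potential_relays(source_cell, node_cells, num_cells)
```

Under the unknown channels, known statistics model, `relay_set` (together with the fading statistics per cell) would play the role of T(t); the controller never needs `p_stay` itself.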

B. Control Options 

Suppose the slot size is normalized to integer slots
$t \in \{0, 1, 2, \ldots\}$. In each slot, the source s receives new packets
for its destination d according to an i.i.d. Bernoulli process
$A_s(t)$ of rate $\lambda_s$. Each packet is assumed to be R bits long
and has a strict delay constraint of 1 slot. Thus, a packet
not served within 1 slot of its arrival is dropped. Further,
packets that are not successfully received by their destinations
due to channel errors are not retransmitted. The source node
has a minimum time-average reliability requirement specified
by a fraction $\rho_s$ which denotes the minimum fraction of packets
that must be delivered successfully. In any slot t, if source s has a
new packet for transmission, it can use one of the following
transmission modes (Fig. 1):

1) Transmit directly to d using the full slot 

2) Transmit to d using traditional relaying over two hops 

3) Transmit cooperatively with the set $\mathcal{R}$ of relay nodes
using the two phase slot structure 

4) Stay idle (so that the packet gets dropped) 

We consider all of these transmission modes because, de- 
pending on the current channel conditions and energy costs 
in slot t, it might be better to choose one over the other. For 
example, due to the half-duplex constraint, direct transmission 
using the full slot might be preferable to cooperative transmis- 
sion over two phases on slots when the source-destination link 
quality is good. Note that this is similar to the much studied 
framework of opportunistic transmission scheduling in time 
varying channels. Further, even in the special case of static 
channels, the optimal strategy may involve a mixture of these 
modes of operation to meet the target reliability and average 
power constraints. 

Let $\mathcal{I}^{\eta}(t)$ denote the collective control action in slot t under
some policy $\eta$ that includes the choice of the transmission
mode at the source, power allocations for the source and all
relevant relays, and any additional physical layer choices such
as modulation and coding. Specifically, we have:

$$\mathcal{I}^{\eta}(t) = [\text{mode choice}, \ \mathbf{P}^{\eta}(t), \ \text{other PHY layer choices}]$$

where the mode choice refers to one of the 4 transmission
modes for the source, and where $\mathbf{P}^{\eta}(t)$ is the collection of
coefficients $P_i^{\eta}(t)$ representing power allocations for each
node $i \in \widehat{\mathcal{R}}$. Note that $P_i^{\eta}(t) = 0$ for all i under transmission
mode 4 (idle). If the source s chooses mode 1, we have
$P_i^{\eta}(t) = 0$ for all relay nodes $i \in \mathcal{R}$, whereas if s chooses
mode 2, we have $P_i^{\eta}(t) > 0$ for at most one relay $i \in \mathcal{R}$.
Note that under any feasible policy $\eta$, $P_i^{\eta}(t)$ must satisfy the
instantaneous peak power constraint every slot for all i. Also
note that under the cooperative transmission option, the power
allocation for the source node and the relays corresponds to the
first and second phase respectively. Thus, the source is active
in the first phase while the relays are active in the second
phase. We denote the set of all valid power allocations by $\mathcal{P}$
and define $\mathcal{C}$ as the set of all valid control actions:

$$\mathcal{C} = \{1, 2, 3, 4\} \times \{\mathcal{P}\} \times \{\text{other PHY layer choices}\}$$

The success/failure outcome of the control action is rep-
resented by an indicator random variable $\Phi_s(\mathcal{I}^{\eta}(t), T(t))$
that depends on the current control action and channel state.
Successful transmission of a packet is usually a complicated
function of the transmission mode chosen, the associated
power allocations and channel states, as well as physical layer
details like modulation, coding/decoding scheme, etc. In this
work, the particular physical layer actions are included in the
$\mathcal{I}^{\eta}(t)$ decision variable. Specifically, given a control action
$\mathcal{I}^{\eta}(t)$ and a channel state T(t), the outcome is defined as
follows:

$$\Phi_s(\mathcal{I}^{\eta}(t), T(t)) = \begin{cases} 1 & \text{if a packet transmitted by } s \text{ in slot } t \text{ is successfully received by } d \\ 0 & \text{else} \end{cases} \qquad (1)$$

Note that $\Phi_s(\mathcal{I}^{\eta}(t), T(t))$ is a random variable, and its
conditional expectation given $(\mathcal{I}^{\eta}(t), T(t))$ is equal to the
success probability under the given physical layer channel
model. Use of this abstract indicator variable allows a unified
treatment that can include a variety of physical layer models.
Under the known channels, unknown statistics model (where
T(t) includes the full channel realizations between source and
relays and relays and destination on slot t), $\Phi_s(\mathcal{I}^{\eta}(t), T(t))$
can be a deterministic 0/1 function based on the known
channel state and control action. Specific examples for this
model are considered in Sec. V. Under the unknown channels,
known statistics model (where T(t) represents only the set of
current possible relays and the fading statistics), we assume
we know the value of $\Pr[\Phi_s(\mathcal{I}^{\eta}(t), T(t)) = 1]$ under each
possible control action $\mathcal{I}^{\eta}(t)$. This model is considered in Sec.
VI. Under both models, we assume that explicit ACK/NACK
information is received at the end of each slot, so that the
source knows the value of $\Phi_s(\mathcal{I}^{\eta}(t), T(t))$. For notational
convenience, in the rest of the paper, we use $\Phi_s(t)$ instead of
$\Phi_s(\mathcal{I}^{\eta}(t), T(t))$, noting that the dependence on $(\mathcal{I}^{\eta}(t), T(t))$
is implicit.

C. Discussion of Basic Model 

The basic model described above extends prior work on 2- 
phase cooperation in static networks to a mobile environment, 
and treats the important example scenario where a team of 
nodes move in a tight cluster but with possible variation in 
the relative locations of nodes within the cluster. We note that 
our model and results are applicable to the special case of 
a static network as well. Another example scenario captured 
by our model is an OFDMA-based cellular network with 
multiple users that have both inter-cell and intra-cell mobility. 
In each slot, a set of transmitters is determined in each 
orthogonal channel (for example, based on a predetermined 
TDMA schedule, or dynamically chosen by the base station).
The remaining nodes can potentially act as cooperative relays 
in that slot. 



The basic model treats scenarios in which a source node 
can transmit to its destination, possibly with the help of 
multiple relay nodes, in 2 stages. While this is a simplifying 
assumption, the framework developed here can be applied to 
more general scenarios in which, in a single slot, cooperative 
relaying over K stages is performed (for some K > 2) using 
multi-hop cooperative techniques (e.g., [22]).

III. Control Objective 

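Let $\alpha_s$ and $\beta_i$ for $i \in \widehat{\mathcal{R}}$ be a collection of non-negative
weights. Then our objective is to design a policy $\eta$ that solves
the following stochastic optimization problem:

$$\begin{aligned}
\text{Maximize:} \quad & \alpha_s \bar{r}_s^{\eta} - \sum_{i \in \widehat{\mathcal{R}}} \beta_i \bar{e}_i^{\eta} \\
\text{Subject to:} \quad & \bar{r}_s^{\eta} \geq \rho_s \lambda_s \\
& \bar{e}_i^{\eta} \leq P_i^{avg} \quad \forall i \in \widehat{\mathcal{R}} \\
& 0 \leq P_i^{\eta}(t) \leq P_i^{max} \quad \forall i \in \widehat{\mathcal{R}}, \ \forall t \\
& \mathcal{I}^{\eta}(t) \in \mathcal{C} \quad \forall t
\end{aligned} \qquad (2)$$

where $\bar{r}_s^{\eta}$ is the time average reliability for source s under
policy $\eta$ and is defined as:

$$\bar{r}_s^{\eta} \triangleq \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{\Phi_s^{\eta}(\tau)\} \qquad (3)$$

and $\bar{e}_i^{\eta}$ is the time average power usage of node i under $\eta$:

$$\bar{e}_i^{\eta} \triangleq \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{P_i^{\eta}(\tau)\} \qquad (4)$$

Here, the expectation is with respect to the possibly ran-
domized control actions that policy $\eta$ might take. The $\alpha_s$ and
$\beta_i$ weights allow us to consider several different objectives.
For example, setting $\alpha_s = 0$ and $\beta_i = 1$ for all i reduces
(2) to the problem of minimizing the average sum power
expenditure subject to minimum reliability and average power
constraints. This objective can be important in the multiple
source scenario when the resources of the relays must be
shared across many users. Setting all of these weights to
0 reduces (2) to a feasibility problem where the objective is
to provide minimum reliability guarantees subject to average
power constraints.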

Problem (2) is similar to the general stochastic utility max-
imization problem presented in [24]. Suppose (2) is feasible
and let $r_s^*$ and $e_i^*$ $\forall i \in \widehat{\mathcal{R}}$ denote the optimal value of
the objective function, potentially achieved by some arbitrary
policy. Using the techniques developed in [24], [23], it can
be shown that it is sufficient to consider only the class of
stationary, randomized policies that take control decisions
purely as a (possibly random) function of the channel state
T(t) every slot to solve (2). However, computing the optimal
stationary, randomized policy explicitly can be challenging
and often impractical as it requires knowledge of arrival
distributions, channel probabilities and mobility patterns in
advance. Further, as pointed out earlier, even in the special case
of a static channel, the optimal strategy may involve a mixture
of direct transmission, multi-hop, and cooperative modes of
operation, and the relaying modes must select different relay
sets over time to achieve the optimal time average mixture.

However, the technique of Lyapunov optimization [24]
can be used to construct an alternate dynamic policy that 
overcomes these challenges and is provably optimal. Unlike 
the stationary, randomized policy, this policy does not need 
to be computed beforehand and can be implemented in an 
online fashion. In the known channels model, it does not need 
a-priori statistics of the traffic, channels, or mobility. In the 
unknown channels model, it does not need a-priori statistics 
of the traffic or mobility. We present this policy in the next 
section. 

IV. Optimal Control Algorithm 

In this section, we present a dynamic control algorithm
that achieves the optimal solution $r_s^*$ and $e_i^*$ $\forall i \in \widehat{\mathcal{R}}$ to
the stochastic optimization problem presented earlier. This
algorithm is similar in spirit to the backpressure algorithms
proposed in [24], [23] for problems of throughput and energy
optimal networking in time varying wireless ad-hoc networks.

The algorithm makes use of a "reliability queue" $Z_s(t)$ for
source s. Specifically, let $Z_s(t)$ be a value that is initialized
to zero (so that $Z_s(0) = 0$), and that is updated at the end of
every slot t according to the following equation:

$$Z_s(t+1) = \max[Z_s(t) - \Phi_s(t), 0] + \rho_s A_s(t) \qquad (5)$$

where $A_s(t)$ is the number of arrivals to source s on slot t
(being either 0 or 1), and $\Phi_s(t)$ is 1 if and only if a packet
that arrived was successfully delivered (recall that ACK/NACK
information gives the value of $\Phi_s(t)$ at the end of every slot t).
Additionally, it also uses the following virtual power queues
$X_i(t)$ $\forall i \in \widehat{\mathcal{R}}$:

$$X_i(t+1) = \max[X_i(t) - P_i^{avg}, 0] + P_i(t) \qquad (6)$$

All these queues are also initialized to 0 and updated at the
end of every slot t according to the equation above. We note
that these queues are virtual in that they do not represent any
real backlog of data packets. Rather, they facilitate the control
algorithm in achieving the time average reliability and energy
constraints of (2) as follows. If a policy $\eta$ stabilizes (5), then
we must have that its service rate is no smaller than the input
rate, i.e.,

$$\bar{r}_s^{\eta} = \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{\Phi_s^{\eta}(\tau)\} \geq \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{\rho_s A_s(\tau)\} = \rho_s \lambda_s$$

Similarly, stabilizing (6) yields the following:

$$\bar{e}_i^{\eta} = \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{P_i^{\eta}(\tau)\} \leq P_i^{avg}$$

where we have used definitions (3), (4). This technique of turn-
ing time-average constraints into queueing stability problems
was first used in [23].
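
The two virtual queue updates (5) and (6) are simple enough to state directly in code. The following Python sketch mirrors them one to one; the function and argument names are illustrative choices, not notation from the paper.

```python
def update_reliability_queue(z, delivered, arrival, rho):
    """Update (5): Z(t+1) = max[Z(t) - Phi(t), 0] + rho * A(t).

    z         -- current reliability queue value Z_s(t)
    delivered -- Phi_s(t), 1 if the packet was delivered in this slot, else 0
    arrival   -- A_s(t), 1 if a packet arrived in this slot, else 0
    rho       -- required delivery fraction rho_s
    """
    return max(z - delivered, 0.0) + rho * arrival

def update_power_queue(x, power_used, p_avg):
    """Update (6): X(t+1) = max[X(t) - P_avg, 0] + P(t)."""
    return max(x - p_avg, 0.0) + power_used

# Example: a packet arrives every slot but is only delivered in odd slots,
# so the reliability queue grows and pushes later decisions toward delivery.
z = 0.0
for t in range(4):
    delivered = t % 2
    z = update_reliability_queue(z, delivered, arrival=1, rho=0.9)
```

A growing `z` signals that the reliability constraint is being violated on average, while a growing power queue signals that a node is spending more than its average power budget.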

To stabilize these virtual queues and optimize the objective
function in (2), the algorithm operates as follows. Let
$Q(t) = (Z_s(t), X_i(t))$ $\forall i \in \widehat{\mathcal{R}}$ denote the collection of these queues
in timeslot t. Every slot t, given Q(t) and the current channel
state T(t), it chooses a control action $\mathcal{I}^*(t)$ that minimizes
the following stochastic metric (for a given control parameter
$V > 0$):

$$\begin{aligned}
\text{Minimize:} \quad & (X_s(t) + V\beta_s) E\{P_s(t) \mid Q(t), T(t)\} + \sum_{i \in \mathcal{R}} (X_i(t) + V\beta_i) E\{P_i(t) \mid Q(t), T(t)\} \\
& - (Z_s(t) + V\alpha_s) E\{\Phi_s(t) \mid Q(t), T(t)\} \\
\text{Subject to:} \quad & 0 \leq P_i(t) \leq P_i^{max} \quad \forall i \in \widehat{\mathcal{R}} \\
& \mathcal{I}(t) \in \mathcal{C}
\end{aligned} \qquad (7)$$

After implementing $\mathcal{I}^*(t)$ and observing the outcome, the
virtual queues are updated using (5), (6). Recall that there
are no actual queues in the system. Our algorithm enforces a
strict 1-slot delay constraint so that $\Phi_s(t) = 0$ if the packet
is not successfully delivered after 1 slot. The virtual queues
$X_i(t)$, $Z_s(t)$ are maintained only in software and act as known
weights in the optimization (7) that guide decisions towards
achieving our time average power and reliability goals. The
control action $\mathcal{I}^*(t)$ that optimizes (7) affects the powers $P_i(t)$
allocated and the $\Phi_s(t)$ value according to (1).

The above optimization is a 2-stage stochastic shortest
path problem [26] where the two stages correspond to the
two phases of the underlying cooperative protocol. Specif-
ically, when s decides to use the option of transmitting
cooperatively, the cost incurred in the first stage is given by
the first term $(X_s(t) + V\beta_s) E\{P_s(t) \mid Q(t), T(t)\}$. The cost
incurred during the second stage is given by
$\sum_{i \in \mathcal{R}} (X_i(t) + V\beta_i) E\{P_i(t) \mid Q(t), T(t)\}$ and at the end of this stage, we
get a reward of $(Z_s(t) + V\alpha_s) E\{\Phi_s(t) \mid Q(t), T(t)\}$. The
transmission outcome $\Phi_s(t)$ depends on the power allocation
decisions in both phases which makes this problem different
from greedy strategies (e.g., [18], [23]). In order to determine
the optimal strategy in slot t, the source s computes the
minimum cost of (7) for all transmission modes described
earlier and chooses one with the least cost.
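
The per-slot decision described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes that a candidate action (power allocation and success probability) has already been computed for each mode, and the names `slot_metric`, `choose_mode`, and the dictionary layout are hypothetical. The idle mode is taken to have cost 0, since it uses no power and delivers nothing.

```python
def slot_metric(tx_powers, weights, success_prob, reward):
    """Value of metric (7) for one candidate action.

    tx_powers    -- dict node -> transmit power P_i(t) under this action
    weights      -- dict node -> X_i(t) + V * beta_i
    success_prob -- E{Phi_s(t)} under this action
    reward       -- Z_s(t) + V * alpha_s
    """
    power_cost = sum(weights[node] * p for node, p in tx_powers.items())
    return power_cost - reward * success_prob

def choose_mode(candidates, weights, reward):
    """Pick the mode with the least metric value.

    candidates -- dict mode -> (tx_powers, success_prob), or None if the
                  mode is infeasible in this slot. Mode 4 (idle) is implicit.
    """
    best_mode, best_cost = 4, 0.0  # idle: no power spent, no delivery
    for mode, cand in sorted(candidates.items()):
        if cand is None:
            continue
        tx_powers, success_prob = cand
        cost = slot_metric(tx_powers, weights, success_prob, reward)
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost

# Example with one relay: cooperation is cheaper than direct transmission.
weights = {"s": 2.0, "r1": 1.0}
candidates = {
    1: ({"s": 3.0}, 1.0),             # direct: high source power
    2: None,                          # multi-hop infeasible this slot
    3: ({"s": 1.0, "r1": 1.0}, 1.0),  # cooperative: power split over phases
}
mode, cost = choose_mode(candidates, weights, reward=5.0)
```

When the reliability queue is small, `reward` is small and the idle mode wins, which matches the intuition that the algorithm only spends power once reliability debt has accumulated.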

Note that this problem is unconstrained since the long term 
time average reliability and power constraints do not appear 
explicitly as in the original problem. These are implicitly 
captured by the virtual queue values. Further, its solution uses 
the value of the current channel state T(t) and does not 
require knowledge of the statistics that govern the evolution of 
the channel state process. Thus, the control strategy involves 
implementing the solution to the sequence of such uncon- 
strained problems every slot and updating the queue values 
according to (5), (6). Assuming i.i.d. T(t) states, the following
theorem characterizes the performance of this dynamic control
algorithm. A similar statement can be made for more general
Markov modulated T(t) using the techniques of [24]. For
simplicity, here we consider the i.i.d. case.

Theorem 1: (Algorithm Performance) Suppose all queues
are initialized to 0. Then, implementing the dynamic algorithm
(7) every slot stabilizes all queues, thereby satisfying the
minimum reliability and time-average power constraints, and
guarantees the following performance bounds (for some $\epsilon > 0$
that depends on the slackness of the feasibility constraints):

$$\lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{Z_s(\tau)\} \leq \frac{B + V\left(\alpha_s + \sum_{i \in \widehat{\mathcal{R}}} \beta_i P_i^{max}\right)}{\epsilon}$$

Further, the time average utility achieved for any $V > 0$
satisfies:

$$\alpha_s \bar{r}_s - \sum_{i \in \widehat{\mathcal{R}}} \beta_i \bar{e}_i \geq \alpha_s r_s^* - \sum_{i \in \widehat{\mathcal{R}}} \beta_i e_i^* - \frac{B}{V}$$

where

$$B = \frac{1 + \rho_s^2 + \sum_{i \in \widehat{\mathcal{R}}} \left[ (P_i^{max})^2 + (P_i^{avg})^2 \right]}{2}$$

Proof: Appendix A. □
Thus, one can get within O(1/V) of the optimal values by
increasing V at the cost of an O(V) increase in the virtual
queue backlogs. The size of these queues affects the time 
required for the time average values to converge to the desired 
performance. 

In the following sections, we investigate the basic 2-stage
resource allocation problem (7) in detail and present solu-
tions for two widely studied classes of cooperative protocols
proposed in the literature: Decode-and-Forward (DF) and
Amplify-and-Forward (AF). These protocols differ in
the way the transmitted signal from the first phase is processed
by the cooperating relays. In DF, a relay fully decodes the
signal. If the packet is received correctly, it is re-encoded
and transmitted in the second phase. In AF, a relay simply
retransmits a scaled version of the received analog signal.
We refer to the literature on these protocols for further details
on their working as well as derivation of expressions for the
mutual information achieved by them. Let $m = |\mathcal{R}|$. In the
following, we assume a Gaussian channel model with a total
bandwidth W and unit noise power per dimension. We use the
information theoretic definition of a transmission failure (an
outage event) as discussed in [19], [20]. Here, an outage occurs
when the total instantaneous mutual information is smaller
than the rate R at which data is being transmitted.
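
The outage definition can be made concrete for a single point-to-point link. The sketch below assumes the standard Gaussian-channel expression W log2(1 + P|h|^2 / W) for the mutual information of a direct transmission with unit noise power per dimension; the function names are hypothetical, and the cooperative protocols require their own (more involved) mutual information expressions that are not reproduced here.

```python
import math

def mutual_information_direct(power, gain_sq, bandwidth):
    """Assumed direct-link mutual information: W * log2(1 + P * |h|^2 / W)."""
    return bandwidth * math.log2(1.0 + power * gain_sq / bandwidth)

def is_outage(mutual_info, rate):
    """Outage occurs when the mutual information is smaller than the rate R."""
    return mutual_info < rate

def min_power_direct(rate, gain_sq, bandwidth):
    """Smallest power that avoids outage, from inverting the expression above."""
    return bandwidth / gain_sq * (2.0 ** (rate / bandwidth) - 1.0)

# Example: W = 1, R = 1, |h|^2 = 0.5 requires at least P = 2.
p_min = min_power_direct(rate=1.0, gain_sq=0.5, bandwidth=1.0)
outage_at_half_power = is_outage(mutual_information_direct(1.0, 0.5, 1.0), 1.0)
```

Because the required power scales as 1/|h|^2, a poor source-destination channel quickly makes direct transmission expensive, which is what motivates the comparison against the relay-assisted modes.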

We first consider the case when the channel gains are known
at the source (Sec. V). In this scenario, (7) becomes a 2-
stage deterministic shortest path problem because the outcome
$\Phi_s(t)$ due to any control decision and its power allocation can
be computed beforehand. Specifically, $\Phi_s(t) = 1$ when the
resulting total mutual information exceeds R and $\Phi_s(t) = 0$
otherwise. Further, this outcome is a function of control
actions taken over two stages when cooperative transmission
is used. This resulting problem is combinatorial and non-
convex and does not admit closed-form solutions in general.
However, for these protocols, we can reduce it to a set of
simpler convex programs for which we can derive quasi-closed
form solutions. Then in Sec. VI we consider the case when
only the statistics of the channel gains are known. In this
case, the outcome $\Phi_s(t)$ is a random function of the control
actions (taken over the two stages in case of cooperative
transmission) and (7) becomes a 2-stage stochastic dynamic
program. While standard dynamic programming techniques
can be used to compute the optimal solution, they are typically
computationally intensive. Therefore, for this case, we present
a Monte Carlo simulation based technique to efficiently solve
the resulting dynamic program.

V. 2-Stage Resource Allocation Problem with 
Known Channels, Unknown Statistics 

Recall that in order to determine the optimal control action
in any slot $t$, we must choose between the four modes of
operation as discussed in Sec. III: (1) direct transmission,
(2) multi-hop relay, (3) cooperative, and (4) idle. Let $c_i(t)$
and $\mathcal{I}_i(t)$ denote the optimal cost of the metric (7), and the
corresponding action that achieves that metric, assuming that
mode $i \in \{1, 2, 3, 4\}$ is chosen in slot $t$. Every slot, the
algorithm computes $c_i(t)$ and $\mathcal{I}_i(t)$ for each mode and then
implements the mode $i$ and the resulting action $\mathcal{I}_i(t)$ that
minimizes cost. Note that the cost $c_4(t)$ for the idle mode is
trivially 0. The minimum cost for direct transmission can be
computed as follows. When the source transmits directly, we
have $P_i(t) = 0 \; \forall i \in \mathcal{R}$. The minimum cost $c_1(t)$ associated
with a successful direct transmission ($\Phi_s(t) = 1$) can be
obtained by solving the following convex problem$^2$:

\[
\begin{aligned}
\text{Minimize:} \quad & (X_s(t) + V\beta_s) P_s(t) - Z_s(t) - V\alpha_s \\
\text{Subject to:} \quad & W \log\left(1 + \frac{P_s(t)}{W} |h_{sd}(t)|^2\right) \geq R \\
& 0 \leq P_s(t) \leq P_s^{max}
\end{aligned}
\tag{8}
\]



where the constraint $W \log\left(1 + \frac{P_s(t)}{W} |h_{sd}(t)|^2\right) \geq R$ represents
the fact that to get $\Phi_s(t) = 1$, the mutual information
must exceed $R$. It is easy to see that if there is a
feasible solution to the above, then for minimum cost, this
constraint must be met with equality. Using this, the minimum
cost corresponding to the direct transmission mode is given
by $c_1(t) = (X_s(t) + V\beta_s) P_s^{dir}(t) - Z_s(t) - V\alpha_s$ if
$P_s^{dir}(t) = \frac{W}{|h_{sd}(t)|^2}\left(2^{R/W} - 1\right) \leq P_s^{max}$. Otherwise, direct transmission
is infeasible and so we set $c_1(t) = +\infty$. In this case, direct
transmission will not be considered as the idle mode cost
$c_4(t) = 0$ is strictly better, but we must also compare with
the costs $c_2(t)$ and $c_3(t)$.
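The direct-transmission computation can be sketched in a few lines of Python. This is only an illustrative sketch: the function and argument names are hypothetical, and the logarithm is assumed to be base 2 (as implied by the $2^{R/W}$ term in the closed-form power).

```python
import math

def direct_transmission_cost(X_s, V, beta_s, Z_s, alpha_s, h_sd_sq, W, R, P_max):
    """Return (cost, power) for the direct mode; cost is +inf if infeasible.

    h_sd_sq is the channel gain |h_sd(t)|^2; all names are illustrative.
    """
    if h_sd_sq <= 0.0:
        return math.inf, None
    # Rate constraint met with equality: W * log2(1 + P * |h_sd|^2 / W) = R.
    p_dir = (W / h_sd_sq) * (2.0 ** (R / W) - 1.0)
    if p_dir > P_max:
        return math.inf, None  # peak power cannot support rate R
    cost = (X_s + V * beta_s) * p_dir - Z_s - V * alpha_s
    return cost, p_dir

# Example: W = R = 1 and unit gain need exactly one unit of power.
cost, power = direct_transmission_cost(2.0, 1.0, 1.0, 5.0, 0.0, 1.0, 1.0, 1.0, 10.0)
```

A negative cost, as in this example, means direct transmission is preferable to staying idle (cost 0) in that slot; the result would still have to be compared against the multi-hop and cooperative costs.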

To compute the minimum cost $c_2(t)$ associated with multi-hop
transmission, note that in this case, the slot is divided into
two parts (Fig. 1(b)) and $P_i(t) > 0$ for at most one $i \in \mathcal{R}$.
This strategy is a special case of the Regenerative DF protocol
(to be discussed next) that uses only 1 relay and in which
the destination does not use signals received from the first
stage for decoding. Therefore, the optimal cost for this can be
calculated using the procedure for the Regenerative DF case
by imposing the single relay constraint and setting $h_{sd}(t) = 0$.

$^2$ Note that the term $-Z_s(t) - V\alpha_s$ in the objective is a constant in any
given slot and does not affect the solution. However, we keep it to compare
the net cost between all modes of operation.
Below we present the computation of the minimum cost
$c_3(t)$ for the cooperative transmission mode under several
protocols. In what follows, we drop the time subscript $(t)$ for
notational convenience.



A. Regenerative DF, Orthogonal Channels 

Here, the source and relays are each assigned an orthogonal
channel of equal size. An example slot structure is shown in
Fig. 1(c), in which the entire slot is divided into $m + 1$ equal
mini-slots. In the first phase of the protocol, $s$ transmits the
packet in its slot using power $P_s$. In the second phase, a subset
$\mathcal{U} \subset \mathcal{R}$ of relays that were successful in reliably decoding the
packet re-encode it using the same code book and transmit to
the destination on their channels with power $P_i$ (where $i \in \mathcal{U}$).
Given such a set U, the total mutual information under this 
protocol is given by ||3l : 



\[
\frac{W}{m} \log\left(1 + \frac{m P_s}{W} |h_{sd}|^2 + \sum_{i \in \mathcal{U}} \frac{m P_i}{W} |h_{id}|^2\right)
\]


This is derived by assuming that the receiver uses Maximal 
Ratio Combining to process the signals. As seen in the 
expression for the mutual information, such an orthogonal 
structure increases the SNR, but utilizes only a fraction of the 
available degrees of freedom leading to reduced multiplexing 
gain. 

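Define binary variables $x_i$ to be 1 if relay $i$ can reliably
decode the packet after the first stage and 0 otherwise. Then, for
this protocol, (7) is equivalent to the following optimization
problem:

\[
\begin{aligned}
\text{Minimize:} \quad & (X_s + V\beta_s) P_s + \sum_{i \in \mathcal{R}} (X_i + V\beta_i) P_i - Z_s - V\alpha_s \\
\text{Subject to:} \quad & \frac{W}{m} \log\left(1 + \frac{m P_s}{W} |h_{si}|^2\right) \geq x_i R \quad \forall i \in \mathcal{R} \\
& \frac{W}{m} \log\left(1 + \frac{m P_s}{W} |h_{sd}|^2 + \sum_{i \in \mathcal{R}} x_i \frac{m P_i}{W} |h_{id}|^2\right) \geq R \\
& 0 \leq P_s \leq P_s^{max} \\
& 0 \leq P_i \leq P_i^{max}, \; x_i \in \{0, 1\} \quad \forall i \in \mathcal{R}
\end{aligned}
\tag{9}
\]
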



The variables $x_i$ capture the requirement that a relay can
cooperatively transmit in the second stage only if it was
successful in reliably decoding the packet using the first stage
transmission. A similar setup is considered in [12] but it treats
the limiting case when $W$ goes to infinity. Because of the
integer constraints on $x_i$, (9) is non-convex. However, we can
exploit the structure of this protocol to reduce the above to
a set of $m + 1$ subproblems as follows. We first order the
relays in decreasing order of their $|h_{si}|^2$ values. Define $\mathcal{U}_k$
as the set that contains the first $k$ (where $0 \leq k \leq m$) relays
from this ordering. Let $P_s^{\mathcal{U}_k}$ denote the minimum source power
required to ensure that all relays in $\mathcal{U}_k$ can reliably decode the
packet after the first stage. We note that for all values of $P_s$ in
the range $(P_s^{\mathcal{U}_k}, P_s^{\mathcal{U}_{k+1}})$, the relay set that can reliably decode
remains the same, i.e., $\mathcal{U}_k$. Thus, we need to consider only
$m + 1$ subproblems, one for each $\mathcal{U}_k$. The subproblem for any
set $\mathcal{U}_k$ is given by:

\[
\begin{aligned}
\text{Minimize:} \quad & (X_s + V\beta_s) P_s + \sum_{i \in \mathcal{U}_k} (X_i + V\beta_i) P_i - Z_s - V\alpha_s \\
\text{Subject to:} \quad & \frac{W}{m} \log\left(1 + \frac{m P_s}{W} |h_{sd}|^2 + \sum_{i \in \mathcal{U}_k} \frac{m P_i}{W} |h_{id}|^2\right) \geq R \\
& P_s^{\mathcal{U}_k} \leq P_s \leq P_s^{max} \\
& 0 \leq P_i \leq P_i^{max} \quad \forall i \in \mathcal{U}_k
\end{aligned}
\tag{10}
\]

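This can easily be expressed as the following LP:

\[
\begin{aligned}
\text{Minimize:} \quad & (X_s + V\beta_s) P_s + \sum_{i \in \mathcal{U}_k} (X_i + V\beta_i) P_i - Z_s - V\alpha_s \\
\text{Subject to:} \quad & P_s |h_{sd}|^2 + \sum_{i \in \mathcal{U}_k} P_i |h_{id}|^2 \geq \theta \\
& P_s^{\mathcal{U}_k} \leq P_s \leq P_s^{max} \\
& 0 \leq P_i \leq P_i^{max} \quad \forall i \in \mathcal{U}_k
\end{aligned}
\tag{11}
\]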

where $\theta = \frac{W}{m}\left(2^{Rm/W} - 1\right)$. The solution to the LP above
has a greedy structure where we start by allocating increasing
power to the nodes (including $s$) in decreasing order of the
value of $\frac{|h_{id}|^2}{X_i + V\beta_i}$ (where $i \in \mathcal{U}_k \cup \{s\}$) till any constraint is
met.




Therefore, for this protocol, the optimal solution to finding
the cost $c_3(t)$ associated with the cooperative transmission
mode in (7) can be computed by solving (11) for each $\mathcal{U}_k$
and picking the one with the least cost. It is interesting to
note that if we impose a constraint on the sum total power of
the relays instead of individual node constraints, then due to
the greedy nature of the solution to (11), it is optimal to select
at most 1 relay for cooperation. Specifically, this relay is the
one that has the highest value of $\frac{|h_{id}|^2}{X_i + V\beta_i}$.
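The greedy structure of the LP for a single relay set can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the authors' implementation: the function name and tuple layout are hypothetical, each node is described by its gain toward the destination, its weight $X + V\beta$ (assumed nonnegative) and its peak power, and the constant term $-Z_s - V\alpha_s$ is left out of the computation.

```python
import math

def greedy_lp(theta, ps_min, source, relays):
    """Greedy solution of the LP for one relay set U_k (illustrative names).

    source: (gain, weight, p_max) with gain = |h_sd|^2, weight = X_s + V*beta_s
    relays: list of (gain, weight, p_max) with gain = |h_id|^2 for i in U_k
    ps_min: minimum source power so that all relays in U_k can decode
    Returns (P_s, [P_i, ...]) or None if the rate constraint cannot be met.
    """
    g_s, w_s, pmax_s = source
    if ps_min > pmax_s:
        return None
    powers = [ps_min] + [0.0] * len(relays)
    remaining = theta - ps_min * g_s  # SNR still needed at the destination
    # Each candidate: (gain, weight, power headroom, index into powers).
    cands = [(g_s, w_s, pmax_s - ps_min, 0)]
    cands += [(g, w, pmax, k + 1) for k, (g, w, pmax) in enumerate(relays)]
    # Most destination gain per unit of weighted power goes first.
    cands.sort(key=lambda c: math.inf if c[1] == 0 else c[0] / c[1], reverse=True)
    for g, w, room, idx in cands:
        if remaining <= 0:
            break
        if g <= 0:
            continue
        add = min(room, remaining / g)
        powers[idx] += add
        remaining -= add * g
    if remaining > 1e-12:
        return None
    return powers[0], powers[1:]

# One strong relay, one weak relay; the strong relay is filled first.
print(greedy_lp(3.0, 0.5, (1.0, 2.0, 10.0), [(2.0, 1.0, 1.0), (0.25, 1.0, 5.0)]))
```

Running this once per relay set and keeping the cheapest feasible allocation mirrors the procedure of solving $m + 1$ subproblems described above.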



B. Non-Regenerative DF, Orthogonal Channels 

This protocol is similar to the Regenerative DF protocol discussed
in Sec. V-A. The only difference is that here, in the
second stage, the subset $\mathcal{U} \subset \mathcal{R}$ of relays that were successful
in reliably decoding the packet re-encode it using independent
code books. In this case, the total mutual information is given 
by Iffl: 



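\[
\frac{W}{m} \log\left(1 + \frac{m P_s}{W} |h_{sd}|^2\right) + \sum_{i \in \mathcal{U}} \frac{W}{m} \log\left(1 + \frac{m P_i}{W} |h_{id}|^2\right)
\]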



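Using the same definition of binary variables $x_i$ as in Sec. V-A,
we can express (7) for this protocol as an optimization problem
that resembles (9). Similar to the Regenerative DF case, we
can then reduce this to a set of $m + 1$ subproblems, one for
each $\mathcal{U}_k$. The subproblem for set $\mathcal{U}_k$ is given by:

\[
\begin{aligned}
\text{Minimize:} \quad & (X_s + V\beta_s) P_s + \sum_{i \in \mathcal{U}_k} (X_i + V\beta_i) P_i - Z_s - V\alpha_s \\
\text{Subject to:} \quad & \log\left(1 + \frac{m P_s}{W} |h_{sd}|^2\right) + \sum_{i \in \mathcal{U}_k} \log\left(1 + \frac{m P_i}{W} |h_{id}|^2\right) \geq \frac{mR}{W} \\
& P_s^{\mathcal{U}_k} \leq P_s \leq P_s^{max} \\
& 0 \leq P_i \leq P_i^{max} \quad \forall i \in \mathcal{U}_k
\end{aligned}
\tag{12}
\]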
The above problem is convex and we can use the KKT
conditions to get the optimal solution (see Appendix B for
details). Define $[x]_0^{P^{max}} \triangleq \min[\max(x, 0), P^{max}]$. Then the
solution to the subproblem for set $\mathcal{U}_k$ is given by:

\[
\begin{aligned}
P_s^* &= \left[\frac{\nu^*}{X_s + V\beta_s} - \frac{W}{m |h_{sd}|^2}\right]_{P_s^{\mathcal{U}_k}}^{P_s^{max}} \\
P_i^* &= \left[\frac{\nu^*}{X_i + V\beta_i} - \frac{W}{m |h_{id}|^2}\right]_0^{P_i^{max}} \quad \forall i \in \mathcal{U}_k
\end{aligned}
\tag{13}
\]

where the lower limit for the source is $P_s^{\mathcal{U}_k}$ instead of 0, and
$\nu^* > 0$ is chosen so that the total mutual information
constraint is met with equality. Therefore, the optimal solution
for the cost $c_3(t)$ in (7) for this protocol can be computed
by solving (13) for each $\mathcal{U}_k$ and picking the one with the least
cost. We note that the solution above has a water-filling type
structure that is typical of related resource allocation problems
in static settings.
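A water-filling allocation of this kind can be computed numerically by bisection on the multiplier. The Python sketch below is illustrative only: the function name and node tuples are hypothetical, the weights $X + V\beta$ and channel terms are assumed strictly positive, the logarithm is taken base 2, and any constant factor from differentiating the logarithm is absorbed into the multiplier.

```python
import math

def waterfill(nodes, target, iters=200):
    """Water-filling powers for one relay set (illustrative names).

    nodes: list of (theta, weight, lo, hi) where theta = m*|h|^2/W,
           weight = X + V*beta > 0, and [lo, hi] is the allowed power range
           (lo is the decoding threshold for the source and 0 for a relay).
    target: required total of log2(1 + theta*P), i.e. m*R/W.
    Returns the list of powers, or None if the target is unreachable.
    """
    def powers(nu):
        return [min(max(nu / w - 1.0 / th, lo), hi) for th, w, lo, hi in nodes]

    def info(p):
        return sum(math.log2(1.0 + th * x) for (th, _, _, _), x in zip(nodes, p))

    if info([hi for _, _, _, hi in nodes]) < target:
        return None  # even full power on every node is not enough
    nu_lo, nu_hi = 0.0, 1.0
    while info(powers(nu_hi)) < target:
        nu_hi *= 2.0  # grow the water level until the target is met
    for _ in range(iters):
        mid = 0.5 * (nu_lo + nu_hi)
        if info(powers(mid)) >= target:
            nu_hi = mid
        else:
            nu_lo = mid
    return powers(nu_hi)

# Two identical nodes sharing a 2-bit target split it evenly.
print(waterfill([(1.0, 1.0, 0.0, 10.0), (1.0, 1.0, 0.0, 10.0)], 2.0))
```

Bisection works here because the total mutual information is nondecreasing in the water level, so the smallest level that meets the target gives the cheapest feasible allocation of this form.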



C. AF, Orthogonal Channels 

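In this protocol, the source and relays are again assigned an
orthogonal channel of equal size. An example slot structure
is shown in Fig. 1(c). However, instead of trying to decode
the packet, the relays amplify and forward the received signal
from the first stage. The total mutual information under this
protocol is given by [13], [16]:

\[
\frac{W}{m} \log\left(1 + \frac{m P_s}{W}\left(|h_{sd}|^2 + \sum_{i \in \mathcal{R}} \psi_i\right)\right)
\]

where $\psi_i = \frac{P_i |h_{si}|^2 |h_{id}|^2}{P_s |h_{si}|^2 + P_i |h_{id}|^2 + W/m}$. Using this, we can express
(7) for this model as follows.

\[
\begin{aligned}
\text{Minimize:} \quad & (X_s + V\beta_s) P_s + \sum_{i \in \mathcal{R}} (X_i + V\beta_i) P_i - Z_s - V\alpha_s \\
\text{Subject to:} \quad & \frac{W}{m} \log\left(1 + \frac{m P_s}{W}\left(|h_{sd}|^2 + \sum_{i \in \mathcal{R}} \psi_i\right)\right) \geq R \\
& 0 \leq P_s \leq P_s^{max} \\
& 0 \leq P_i \leq P_i^{max} \quad \forall i \in \mathcal{R}
\end{aligned}
\tag{14}
\]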



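This problem is non-convex. However, if we fix the source
power $P_s$, then it becomes convex in the other variables.
This reduction has been used in [16] as well, although it
considers a static scenario with the objective of minimizing
instantaneous outage probability. After fixing $P_s$, we can
compute the optimal relay powers for this value of $P_s$ by
solving the following:

\[
\begin{aligned}
\text{Minimize:} \quad & \sum_{i \in \mathcal{R}} (X_i + V\beta_i) P_i - Z_s - V\alpha_s \\
\text{Subject to:} \quad & P_s |h_{sd}|^2 + \sum_{i \in \mathcal{R}} P_s \psi_i \geq \theta \\
& 0 \leq P_i \leq P_i^{max} \quad \forall i \in \mathcal{R}
\end{aligned}
\tag{15}
\]

where $\theta = \frac{W}{m}\left(2^{Rm/W} - 1\right)$. The first constraint can be
simplified as:

\[
P_s |h_{sd}|^2 + \sum_{i \in \mathcal{R}} P_s \psi_i = P_s\left(|h_{sd}|^2 + \sum_{i \in \mathcal{R}} |h_{si}|^2\right) - \sum_{i \in \mathcal{R}} \frac{P_s^2 |h_{si}|^4 + P_s |h_{si}|^2 W/m}{P_s |h_{si}|^2 + P_i |h_{id}|^2 + W/m}
\]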

Since we have fixed $P_s$, we can express (15) as:

\[
\begin{aligned}
\text{Minimize:} \quad & \sum_{i \in \mathcal{R}} (X_i + V\beta_i) P_i - Z_s - V\alpha_s \\
\text{Subject to:} \quad & \sum_{i \in \mathcal{R}} \frac{P_s^2 |h_{si}|^4 + P_s |h_{si}|^2 W/m}{P_s |h_{si}|^2 + P_i |h_{id}|^2 + W/m} \leq \theta' \\
& 0 \leq P_i \leq P_i^{max} \quad \forall i \in \mathcal{R}
\end{aligned}
\tag{16}
\]

where $\theta' = P_s\left(|h_{sd}|^2 + \sum_{i \in \mathcal{R}} |h_{si}|^2\right) - \theta$. Using the
KKT conditions, the solution to the above convex optimization
problem is given by (see Appendix C for details):

\[
P_i^* = \left[\sqrt{\frac{\nu^* \left(P_s^2 |h_{si}|^4 + P_s |h_{si}|^2 W/m\right)}{(X_i + V\beta_i) |h_{id}|^2}} - \frac{P_s |h_{si}|^2 + W/m}{|h_{id}|^2}\right]_0^{P_i^{max}}
\]

where $\nu^* > 0$ is chosen so that the second constraint is met
with equality. We note that this solution has a water-filling type
structure as well. Therefore, to compute the optimal solution
to (7) for this protocol, we would have to solve the above for
each value of $P_s \in [0, P_s^{max}]$. In practice, this computation can
be simplified by considering only a discrete set of values for
$P_s$. Because we have derived a simple closed form expression
for each $P_s$, it is easy to compare these values over, say, a
discrete list of 100 options in $[0, P_s^{max}]$ to pick the best one,
which enables a very accurate approximation to optimality in
real time.
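The two-level procedure (closed-form relay powers for a fixed source power, then a search over a discrete grid of source powers) can be sketched as follows. This is a hedged illustration: the function names, the tuple layout, and the bisection on the multiplier are assumptions introduced here, weights and relay-to-destination gains are assumed strictly positive, and the constant $-Z_s - V\alpha_s$ is omitted from the returned cost.

```python
import math

def af_relay_powers(P_s, h_sd_sq, relays, theta, w_over_m, iters=100):
    """Relay powers for a fixed source power P_s (illustrative names).

    relays: list of (h_si_sq, h_id_sq, weight, p_max), weight = X_i + V*beta_i.
    Returns a list of relay powers, or None if rate R cannot be supported.
    """
    A = [P_s ** 2 * a * a + P_s * a * w_over_m for a, _, _, _ in relays]
    B = [P_s * a + w_over_m for a, _, _, _ in relays]
    theta_p = P_s * (h_sd_sq + sum(a for a, _, _, _ in relays)) - theta

    def powers(nu):
        return [min(max(math.sqrt(nu * Ai / (w * g)) - Bi / g, 0.0), pmax)
                for (_, g, w, pmax), Ai, Bi in zip(relays, A, B)]

    def lhs(p):
        # Left-hand side of the simplified constraint; decreasing in each P_i.
        return sum(Ai / (Bi + g * x)
                   for (_, g, _, _), Ai, Bi, x in zip(relays, A, B, p))

    if lhs([0.0] * len(relays)) <= theta_p:
        return [0.0] * len(relays)          # the direct link alone suffices
    if lhs([r[3] for r in relays]) > theta_p:
        return None                          # infeasible even at peak powers
    lo, hi = 0.0, 1.0
    while lhs(powers(hi)) > theta_p:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if lhs(powers(mid)) <= theta_p:
            hi = mid
        else:
            lo = mid
    return powers(hi)

def af_min_cost(h_sd_sq, relays, theta, w_over_m, w_s, ps_max, grid=100):
    """Search a discrete grid of source powers; returns (cost, P_s, relay powers)."""
    best = None
    for k in range(1, grid + 1):
        P_s = ps_max * k / grid
        p = af_relay_powers(P_s, h_sd_sq, relays, theta, w_over_m)
        if p is None:
            continue
        cost = w_s * P_s + sum(w * x for (_, _, w, _), x in zip(relays, p))
        if best is None or cost < best[0]:
            best = (cost, P_s, p)
    return best

print(af_min_cost(0.5, [(1.0, 1.0, 1.0, 10.0)], 1.0, 1.0, 1.0, 10.0))
```

The grid size of 100 follows the discretization suggested in the text; a finer grid trades computation for a closer approximation.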



D. DF with DSTC 

In this protocol, all the cooperating relays in the second 
stage use an appropriate distributed space-time code (DSTC) 
El so that they can transmit simultaneously on the same 
channel. The slot structure under this scheme is shown in
Fig. 1(d). Suppose in the first phase of the protocol, $s$ transmits
the packet in the first half of the slot using power $P_s$. In the
second phase, a subset $\mathcal{U} \subset \mathcal{R}$ of relays that were successful
in reliably decoding the packet re-encode it using a DSTC
and transmit to the destination with power $P_i$ (where $i \in \mathcal{U}$)
in the second half of the slot. Given such a set $\mathcal{U}$, the total
mutual information under this protocol is given by 0: 



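\[
\frac{W}{2} \log\left(1 + \frac{2 P_s}{W} |h_{sd}|^2 + \sum_{i \in \mathcal{U}} \frac{2 P_i}{W} |h_{id}|^2\right)
\]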



The factor of 2 appears because only half of the slot is being 
used for transmission. As seen in the expression above, unlike 
the earlier examples, this protocol does not suffer from reduced 
multiplexing gains due to orthogonal channels. 

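We can now express (7) for this protocol as follows. Define
binary variables $x_i$ to be 1 if relay $i$ can reliably decode the
packet after the first stage and 0 otherwise. Then, for this protocol,
(7) is equivalent to the following optimization problem:

\[
\begin{aligned}
\text{Minimize:} \quad & (X_s + V\beta_s) P_s + \sum_{i \in \mathcal{R}} (X_i + V\beta_i) P_i - Z_s - V\alpha_s \\
\text{Subject to:} \quad & \frac{W}{2} \log\left(1 + \frac{2 P_s}{W} |h_{si}|^2\right) \geq x_i R \quad \forall i \in \mathcal{R} \\
& \frac{W}{2} \log\left(1 + \frac{2 P_s}{W} |h_{sd}|^2 + \sum_{i \in \mathcal{R}} x_i \frac{2 P_i}{W} |h_{id}|^2\right) \geq R \\
& 0 \leq P_s \leq P_s^{max} \\
& 0 \leq P_i \leq P_i^{max}, \; x_i \in \{0, 1\} \quad \forall i \in \mathcal{R}
\end{aligned}
\tag{17}
\]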
By comparing the above with (9), it can be seen that the
computation of minimum cost under this protocol follows the
same procedure as described in Sec. V-A of solving $m + 1$
subproblems, each an LP, by ordering the relays greedily and
hence we do not repeat it.

E. AF with DSTC 

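Here, all cooperating relays use amplify and forward along
with DSTC. The total mutual information under this protocol
is given by:

\[
\frac{W}{2} \log\left(1 + \frac{2 P_s}{W}\left(|h_{sd}|^2 + \sum_{i \in \mathcal{R}} \psi_i\right)\right)
\]

where $\psi_i = \frac{P_i |h_{si}|^2 |h_{id}|^2}{P_s |h_{si}|^2 + P_i |h_{id}|^2 + W/2}$. Using this, we can express
(7) for this model as follows.

\[
\begin{aligned}
\text{Minimize:} \quad & (X_s + V\beta_s) P_s + \sum_{i \in \mathcal{R}} (X_i + V\beta_i) P_i - Z_s - V\alpha_s \\
\text{Subject to:} \quad & \frac{W}{2} \log\left(1 + \frac{2 P_s}{W}\left(|h_{sd}|^2 + \sum_{i \in \mathcal{R}} \psi_i\right)\right) \geq R \\
& 0 \leq P_s \leq P_s^{max} \\
& 0 \leq P_i \leq P_i^{max} \quad \forall i \in \mathcal{R}
\end{aligned}
\tag{18}
\]

This is similar to (14) and thus, we fix $P_s$ and use a similar
reduction to get a convex optimization problem whose solution
can be derived using KKT conditions and is given by:

\[
P_i^* = \left[\sqrt{\frac{\nu^* \left(P_s^2 |h_{si}|^4 + P_s |h_{si}|^2 W/2\right)}{(X_i + V\beta_i) |h_{id}|^2}} - \frac{P_s |h_{si}|^2 + W/2}{|h_{id}|^2}\right]_0^{P_i^{max}}
\]

where $\nu^* > 0$ is chosen so that the constraint on the total
mutual information at the destination is met with equality.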

VI. 2-Stage Resource Allocation Problem with 
Unknown Channels, Known Statistics 

We next consider the solution to (7) when the source does
not know the current channel gains and is only aware of
their statistics. In this case, (7) becomes a 2-stage stochastic
dynamic program. For brevity, here we focus on its solution
for the cooperative transmission mode.

Suppose the source uses power $P_s$ in the first stage. Let $\omega$
denote the outcome of this transmission. This lies in a space $\Omega$
of possible network states which is assumed to be of a finite
but arbitrarily large size. For example, in the DF protocol,
$\omega$ might represent the set of relay nodes that received the
packet successfully after the first stage as well as the mutual
information accumulated so far at the destination. For AF, $\omega$
can represent the SNR value at each relay node and at the
destination.

Let $J_1^*(P_s, \omega)$ be the optimal cost-to-go function for the 2-stage
dynamic program (7) given that the source uses power
$P_s$ in the first stage and the network state is $\omega$ at the beginning
of the second stage. Let $J_0^*$ denote the optimal cost-to-go
function starting from the first stage. Also, let $\mathcal{R}(\omega)$ denote
the set of relay nodes that can take part in cooperative
transmission when the network state is $\omega$. We define the
following probabilities. Let $f(P_s, \omega)$ be the probability that the
outcome of the first stage is $\omega$ when the source uses power $P_s$.
Also, let $g(P_{\mathcal{R}(\omega)}, P_s, \omega)$ be the probability that the receiver
gets the packet successfully when relays in $\mathcal{R}(\omega)$ use a power
allocation $P_{\mathcal{R}(\omega)}$ and the source uses power $P_s$. Note that
these probabilities are obtained by taking expectation over all
channel state realizations. We assume these are obtained from
the knowledge of the channel statistics.

Using these definitions, we can now write the Bellman
optimality equations [26] for this dynamic program $\forall \omega \in \Omega$:

\[
J_0^* = \min_{P_s} \left[(X_s + V\beta_s) P_s + \sum_{\omega \in \Omega} f(P_s, \omega) J_1^*(P_s, \omega)\right]
\tag{19}
\]

\[
J_1^*(P_s, \omega) = \min_{P_{\mathcal{R}(\omega)}} \left[\sum_{i \in \mathcal{R}(\omega)} (X_i + V\beta_i) P_i - (Z_s + V\alpha_s) \, g(P_{\mathcal{R}(\omega)}, P_s, \omega)\right]
\tag{20}
\]

While this can be solved using standard dynamic programming
techniques, it has a computational complexity that grows
with the size of the state space $\Omega$ and can be prohibitive when this is
large. We therefore present an alternate method based on the
idea of Monte Carlo simulation.



A. Simulation Based Method 

Suppose the transmitter performs the following simulation.
Fix a source power $P_s$. Define $J_0(P_s)$ as the optimal cost-to-go
function given that the source uses power $P_s$. Note that
this is simply the expression on the right hand side of (19)
with $P_s$ fixed. Simulate the outcome of a transmission at this
power $n$ times independently using the values of $f(P_s, \omega)$.
Let $\omega_j \in \Omega$ denote the outcome of the $j$th simulation. For
each generated outcome $\omega_j$, compute the optimal cost-to-go
function $J_1^*(P_s, \omega_j)$ by solving (20) (this could be done
using the knowledge of $g(P_{\mathcal{R}(\omega_j)}, P_s, \omega_j)$ either analytically
or numerically). Use this to update $J_0^{est}(P_s, n)$, which is an
estimate of $J_0(P_s)$ for a given $P_s$ after $n$ iterations and is
defined as follows:

\[
J_0^{est}(P_s, n) = (X_s + V\beta_s) P_s + \frac{1}{n} \sum_{j=1}^{n} J_1^*(P_s, \omega_j)
\tag{21}
\]


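We now show that, for a given $P_s$, $J_0^{est}(P_s, n)$ can be
pushed arbitrarily close to the optimal cost-to-go function
$J_0(P_s)$ by increasing $n$. Since we have fixed $P_s$, from (19)
we have:

\[
J_0(P_s) = (X_s + V\beta_s) P_s + \sum_{\omega \in \Omega} f(P_s, \omega) J_1^*(P_s, \omega)
\]

Define the following indicator random variables for each
simulation $j$ and $\forall \omega \in \Omega$:

\[
1_\omega(P_s, j) = \begin{cases} 1 & \text{if the outcome of simulation } j \text{ is } \omega \\ 0 & \text{else} \end{cases}
\]

Note that by definition $E\{1_\omega(P_s, j)\} = f(P_s, \omega)$. Therefore,
we can express $J_0^{est}(P_s, n)$ in terms of these indicator
variables as follows:

\[
J_0^{est}(P_s, n) = (X_s + V\beta_s) P_s + \frac{1}{n} \sum_{j=1}^{n} \sum_{\omega \in \Omega} 1_\omega(P_s, j) J_1^*(P_s, \omega)
\]

We note that $\left(\sum_{\omega \in \Omega} 1_\omega(P_s, j) J_1^*(P_s, \omega)\right)$ are i.i.d. random
variables with mean $\mu = \sum_{\omega \in \Omega} f(P_s, \omega) J_1^*(P_s, \omega)$
and variance $\sigma^2 = \sum_{\omega \in \Omega} f(P_s, \omega) \left(J_1^*(P_s, \omega)\right)^2 - \mu^2$. Using
Chebyshev's inequality, we get for any $\epsilon > 0$:

\[
\Pr\left\{\left|\frac{1}{n} \sum_{j=1}^{n} \sum_{\omega \in \Omega} 1_\omega(P_s, j) J_1^*(P_s, \omega) - \mu\right| \geq \epsilon\right\} \leq \frac{\sigma^2}{n \epsilon^2}
\]

This shows that the value of the estimate converges
to the optimal cost-to-go value as $n$ grows, with the probability of
an error larger than $\epsilon$ decaying at least as fast as $1/n$. Thus, this method can be used
to get a good estimate of the optimal cost-to-go function for
a fixed value of $P_s$ in a reasonable number of steps.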
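The estimator can be sketched in Python for a toy state space. Everything concrete here is an assumption made for illustration: the outcome labels, their probabilities, and the second-stage costs stand in for $\Omega$, $f(P_s, \omega)$ and $J_1^*(P_s, \omega)$, which in practice would come from the channel statistics and from solving the second-stage problem.

```python
import math
import random

def estimate_cost_to_go(first_stage_cost, outcomes, probs, cost_to_go, n, seed=0):
    """Monte Carlo estimate of the first-stage cost-to-go for a fixed P_s.

    first_stage_cost: (X_s + V*beta_s) * P_s
    outcomes, probs:  toy state space and its first-stage outcome probabilities
    cost_to_go:       dict mapping each outcome to its second-stage optimal cost
    """
    rng = random.Random(seed)
    draws = rng.choices(outcomes, weights=probs, k=n)  # n independent simulations
    return first_stage_cost + sum(cost_to_go[w] for w in draws) / n

def exact_cost_to_go(first_stage_cost, outcomes, probs, cost_to_go):
    """Exact expectation, feasible only when the state space is small."""
    return first_stage_cost + sum(p * cost_to_go[w] for w, p in zip(outcomes, probs))

def chebyshev_samples(variance, eps, delta):
    """Smallest n for which the Chebyshev bound variance / (n * eps^2) is <= delta."""
    return math.ceil(variance / (delta * eps * eps))

# Toy example with two outcomes (all values hypothetical).
outcomes = ["relay_decoded", "relay_failed"]
probs = [0.5, 0.5]
costs = {"relay_decoded": -3.0, "relay_failed": 0.0}
est = estimate_cost_to_go(1.0, outcomes, probs, costs, n=20000)
exact = exact_cost_to_go(1.0, outcomes, probs, costs)
print(est, exact)
```

The helper `chebyshev_samples` gives a conservative sample count; because Chebyshev's inequality is loose, far fewer simulations often suffice in practice.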
VII. Multi-Source Extensions

In this section, we extend the basic model of Sec. III to
the case when there are multiple sources in the network. Let
the set of source nodes be given by $\mathcal{S}$. We consider the case
when all source nodes have orthogonal channels.$^3$ In particular,
we assume that in each slot, a medium access process $\chi(t)$
determines which source nodes get transmission opportunities.
For simplicity, we assume that at most one source transmits in
a slot. This models situations where there might be a pseudo-random
TDMA schedule that determines a unique transmitter
node every slot. It also models situations where the source
nodes use a contention-resolution mechanism such as CSMA.
Our model can be extended to scenarios where more than
one source node can transmit, potentially over orthogonal
frequency channels.

Let $s(t) = s(\chi(t)) \in \mathcal{S}$ be the source node that gets a
transmission opportunity in slot $t$. Then, the optimal resource
allocation framework developed in Sec. IV can be applied as
follows. A virtual reliability queue is defined for each source
node $s \in \mathcal{S}$ and is updated as in the single-source case. Note that in slots where a
source node $s$ does not get a transmission opportunity, $\Phi_s(t) = 0$.
We assume that each incoming packet gets one transmission
opportunity so that the delay constraint of 1 slot per packet
only measures the transmission delay and not the queueing
delay that would be incurred due to contention. Similarly, a
virtual power queue is maintained for each node as in the single-source case,
including the source nodes and relay nodes. Note that in this
model, it is possible for a source node to act as a relay for
another source node when it is not transmitting its own data.
We denote the set of relay nodes (that includes such source
nodes) in slot $t$ as $\mathcal{R}(t)$.

Then the optimal control algorithm operates as follows. Let
$Q(t)$ denote the collection of all virtual queues in timeslot $t$.
Every slot, given $Q(t)$ and any channel state $\mathcal{T}(t)$, it chooses
a control action $\mathcal{I}_{s(t)}(t)$ that minimizes the following stochastic
metric (for a given control parameter $V > 0$):

\[
\begin{aligned}
\text{Minimize:} \quad & (X_{s(t)}(t) + V\beta_{s(t)}) E\{P_{s(t)}(t) \mid Q(t), \mathcal{T}(t)\} \\
& + \sum_{i \in \mathcal{R}(t)} (X_i(t) + V\beta_i) E\{P_i(t) \mid Q(t), \mathcal{T}(t)\} \\
& - (Z_{s(t)}(t) + V\alpha_{s(t)}) E\{\Phi_{s(t)}(t) \mid Q(t), \mathcal{T}(t)\} \\
\text{Subject to:} \quad & 0 \leq P_{s(t)}(t) \leq P_{s(t)}^{max} \\
& 0 \leq P_i(t) \leq P_i^{max} \quad \forall i \in \mathcal{R}(t) \\
& \mathcal{I}_{s(t)}(t) \in \mathcal{C}
\end{aligned}
\tag{22}
\]

$^3$ For the non-orthogonal scenario, there will be two sources of outages:
transmission failure at the physical layer and delay violation due to contention
in medium access. Hence, MAC scheduling in addition to physical layer
resource allocation must be considered. This is not the focus of the current
work.

Fig. 2. A snapshot of the example network used in simulation.


This problem can be solved using the techniques described for 
the single source case. 

VIII. Simulations 

We simulate the dynamic control algorithm (7) in an ad-hoc
network with 3 stationary sources and 7 mobile relays as
shown in Fig. 2. Every slot, the sources receive new packets
destined for the base station according to an i.i.d. Bernoulli
process of rate $\lambda$ and each packet has a delay constraint of
1 slot. The sources are assumed to have orthogonal channels
and can transmit either directly or cooperatively with a subset
of the relays in their vicinity. We impose a cell-partitioned
structure so that a source can only cooperate with the relays
that are in the same cell in that slot. The relays move from
one cell to the other according to a Markovian random walk.
In the simulation, at the end of every slot, a relay decides to
stay in its current cell with probability 0.8, else decides to
move to an adjacent cell with probability 0.2 (where any of
the feasible adjacent cells are equally likely).

We assume a Rayleigh fading model. The amplitude squares 
of the instantaneous gains on the links involving a source, the 
set of relays in its cell in that slot and the base station are 
exponentially distributed random variables with mean 1. All 
power values are normalized with respect to the average noise 
power. All nodes have an average power constraint of 1 unit 
and a maximum power constraint of 10 units. 

We consider the Regenerative DF cooperative protocol
over orthogonal channels and implement the optimal resource
allocation strategy as computed in (11) for this network. In
the first experiment, we consider the objective of minimizing
the average sum power expenditure in the network given
a minimum reliability constraint $\rho_s = 0.98$ and input rate
$\lambda_s = 0.5$ packets/slot for all sources. For this, we set $\alpha_s = 0$
and $\beta_i = 1$. Fig. 3 shows the average sum power for
different values of the control parameter $V$. It is seen that
this value converges to 2.6 units for increasing values of $V$,
as predicted by the performance bounds on the time average
utility in Theorem 1. Fig. 4 shows the resulting average
reliability queue occupancy. It is seen to increase linearly in
$V$, again as predicted by the bound on the time average queue
backlog in Theorem 1. We emphasize again that there are
no actual queues in the system, and all successfully delivered
packets have a delay exactly equal to 1 slot. The fact that
all reliability queues are stable ensures that we are indeed
meeting or exceeding the 98% reliability constraint. Indeed,
in our simulations we found reliability to be almost exactly
equal to the 98% constraint, as expected in an algorithm
designed to minimize average power subject to this constraint.
We further note that the instantaneous reliability queue value
$Z(t)$ represents the worst case "excess" packets that did not
meet the reliability constraints over any interval ending at time
$t$, so that maintaining small $Z(t)$ (with a small $V$) makes the
timescales over which the time average reliability constraints
are satisfied smaller.

In the second experiment, we choose both $\alpha_s = 0$ and
$\beta_i = 0$ so that the resource allocation problem becomes a feasibility problem. We fix the
average and peak power values to 1 and 10 respectively and
implement (11) for different rate-reliability pairs. In Table I
we show whether these are feasible or not under three resource
allocation strategies: direct transmission, always cooperative
transmission and dynamic cooperation (that corresponds to
implementing the solution to (11) every slot). It can be seen
that dynamic cooperation significantly increases the feasible
rate-reliability region over direct transmission as well as static
cooperation. For example, it is impossible to achieve 95%
reliability using direct transmission alone, even if the traffic
rate is only 0.2 packets/slot. This can be achieved by an
algorithm that always uses the cooperation mode (mode 3)
but optimizes over the power allocation decisions of this
cooperation mode as specified in previous sections. However,
always using cooperation fails if we desire 98% reliability,
whereas our optimal policy, which dynamically mixes between
the different modes and chooses efficient power allocation
decisions in each mode, can achieve 98% reliability even at
increased rates up to 0.6 packets/slot.

IX. Conclusions 

In this paper, we considered the problem of optimal resource 
allocation for delay-limited cooperative communication in a 
mobile ad-hoc network. Using the technique of Lyapunov 
optimization, we developed dynamic cooperation strategies 
that make optimal use of network resources to achieve a 
target outage probability (reliability) for each user subject to 
average power constraints. Our framework is general enough 
to be applicable to a large class of cooperative protocols. 
In particular, in this paper, we derived quasi-closed form 
solutions for several variants of the Decode-and-Forward and
Amplify-and-Forward strategies. 

Appendix A: Proof of Theorem 1 

Here, we prove Theorem 1 by comparing the Lyapunov drift
of the dynamic control algorithm (7) with that of an optimal
stationary, randomized policy. Let $r_s^*$ and $e_i^* \; \forall i \in \mathcal{R}$ denote
the optimal value of the objective in the original optimization problem. Then we have the
following fact.$^4$

Existence of an Optimal Stationary, Randomized Policy:
Assuming i.i.d. $\mathcal{T}(t)$ states, there exists a stationary
randomized policy $\pi$ that chooses feasible control action
$\mathcal{I}_s^\pi(t)$ and power allocations $P_i^\pi(t)$ for all $i \in \mathcal{R}$ every slot
purely as a function of the current channel state $\mathcal{T}(t)$ and

$^4$ This can be shown using the techniques developed in [23].
TABLE I
Feasibility of different rate-reliability pairs (✓ = feasible, ✗ = infeasible).

(rate, reliability) = (λ_s, ρ_s)   (0.1, 0.9)  (0.2, 0.9)  (0.2, 0.95)  (0.5, 0.95)  (0.5, 0.98)  (0.6, 0.98)  (0.7, 0.99)
direct transmission                    ✓           ✓           ✗            ✗            ✗            ✗            ✗
always cooperate                       ✓           ✓           ✓            ✓            ✗            ✗            ✗
optimal strategy                       ✓           ✓           ✓            ✓            ✓            ✓            ✗




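yields the following for some $\epsilon > 0$:

\[ E\{\Phi_s^\pi(t)\} \geq \rho_s \lambda_s + \epsilon \tag{23} \]

\[ E\{P_i^\pi(t)\} + \epsilon \leq P_i^{avg} \quad \forall i \in \mathcal{R} \tag{24} \]

\[ \alpha_s E\{\Phi_s^\pi(t)\} - \sum_{i \in \mathcal{R}} \beta_i E\{P_i^\pi(t)\} = \alpha_s r_s^* - \sum_{i \in \mathcal{R}} \beta_i e_i^* \tag{25} \]

Let $Q(t) = (Z_s(t), X_i(t)) \; \forall i \in \mathcal{R}$ represent the collection
of these queue backlogs in timeslot $t$. We define a quadratic
Lyapunov function:

\[ L(Q(t)) \triangleq \frac{1}{2}\left[Z_s^2(t) + \sum_{i \in \mathcal{R}} X_i^2(t)\right] \]

Also define the conditional Lyapunov drift $\Delta(Q(t))$ as
follows:

\[ \Delta(Q(t)) \triangleq E\{L(Q(t+1)) - L(Q(t)) \mid Q(t)\} \]

Using the queueing dynamics, the Lyapunov drift under
any control policy can be computed as follows:

\[
\Delta(Q(t)) \leq B - Z_s(t) E\{\Phi_s(t) - \rho_s A_s(t) \mid Q(t)\} - \sum_{i \in \mathcal{R}} X_i(t) E\{P_i^{avg} - P_i(t) \mid Q(t)\}
\tag{26}
\]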

where B = 

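For a given control parameter $V > 0$, we subtract a "reward"
metric $V E\{\alpha_s \Phi_s(t) - \sum_{i \in \mathcal{R}} \beta_i P_i(t) \mid Q(t)\}$ from both sides
of the above inequality to get the following:

\[
\begin{aligned}
\Delta(Q(t)) - V E\Big\{\alpha_s \Phi_s(t) - \sum_{i \in \mathcal{R}} \beta_i P_i(t) \,\Big|\, Q(t)\Big\} \leq{} & B - Z_s(t) E\{\Phi_s(t) - \rho_s A_s(t) \mid Q(t)\} \\
& - \sum_{i \in \mathcal{R}} X_i(t) E\{P_i^{avg} - P_i(t) \mid Q(t)\} \\
& - V E\Big\{\alpha_s \Phi_s(t) - \sum_{i \in \mathcal{R}} \beta_i P_i(t) \,\Big|\, Q(t)\Big\}
\end{aligned}
\tag{27}
\]

From the above, it can be seen that the dynamic control
algorithm (7) is designed to take a control action that minimizes
the right hand side of (27) over all possible options every slot,
including the stationary policy $\pi$. Thus, using (23), (24), (25),
we can write the above as:

\[
\Delta(Q(t)) - V E\Big\{\alpha_s \Phi_s(t) - \sum_{i \in \mathcal{R}} \beta_i P_i(t) \,\Big|\, Q(t)\Big\} \leq B - Z_s(t)\epsilon - \sum_{i \in \mathcal{R}} X_i(t)\epsilon - V\Big(\alpha_s r_s^* - \sum_{i \in \mathcal{R}} \beta_i e_i^*\Big)
\tag{28}
\]

Theorem 1 now follows by a direct application of the Lyapunov
optimization Theorem [24].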
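Appendix B: Solution to Non-Regenerative DF Orthogonal Using KKT Conditions

We ignore the constant terms in the objective. It is easy to
see that the first constraint in (12) must be met with equality.
The Lagrangian is given by:

\[
\begin{aligned}
\mathcal{L} ={} & (X_s + V\beta_s) P_s + \sum_{i \in \mathcal{U}_k} (X_i + V\beta_i) P_i - \lambda_s (P_s - P_s^{\mathcal{U}_k}) - \sum_{i \in \mathcal{U}_k} \lambda_i P_i \\
& + \mu_s (P_s - P_s^{max}) + \sum_{i \in \mathcal{U}_k} \mu_i (P_i - P_i^{max}) \\
& + \nu \Big[\log(1 + \theta_s P_s) + \sum_{i \in \mathcal{U}_k} \log(1 + \theta_i P_i) - \frac{mR}{W}\Big]
\end{aligned}
\]

where $\theta_s = \frac{m}{W} |h_{sd}|^2$ and $\theta_i = \frac{m}{W} |h_{id}|^2$, and where $\lambda$ and $\mu$ denote the
multipliers of the lower and upper power limits respectively. The KKT conditions for
all $i \in \mathcal{U}_k$ are:

\[
\begin{aligned}
& \lambda_s^* (P_s^* - P_s^{\mathcal{U}_k}) = 0, \quad \lambda_i^* P_i^* = 0 \\
& \mu_s^* (P_s^* - P_s^{max}) = 0, \quad \mu_i^* (P_i^* - P_i^{max}) = 0 \\
& (X_s + V\beta_s) - \lambda_s^* + \mu_s^* + \frac{\nu^* \theta_s}{1 + \theta_s P_s^*} = 0 \\
& (X_i + V\beta_i) - \lambda_i^* + \mu_i^* + \frac{\nu^* \theta_i}{1 + \theta_i P_i^*} = 0
\end{aligned}
\]

If $\nu^* \geq 0$, then we must have that $\lambda_s^* - \mu_s^* > 0$ and $\lambda_i^* - \mu_i^* > 0$
for all $i$. This would mean that $P_s^* = P_s^{\mathcal{U}_k}$ and $P_i^* = 0$. For
some $\nu^* < 0$, we have three cases:

1) If $\lambda_i^* = \mu_i^*$, we get $P_i^* = \frac{-\nu^*}{X_i + V\beta_i} - \frac{1}{\theta_i}$

2) If $\lambda_i^* > \mu_i^*$, then we must have $\lambda_i^* > 0$ and we get
$P_i^* = 0$

3) If $\lambda_i^* < \mu_i^*$, then we must have $\mu_i^* > 0$ and we get
$P_i^* = P_i^{max}$

Similar results can be obtained for $P_s^*$. Combining these, we
get:

\[
P_s^* = \left[\frac{-\nu^*}{X_s + V\beta_s} - \frac{1}{\theta_s}\right]_{P_s^{\mathcal{U}_k}}^{P_s^{max}}, \qquad
P_i^* = \left[\frac{-\nu^*}{X_i + V\beta_i} - \frac{1}{\theta_i}\right]_0^{P_i^{max}}
\]

where $[X]_0^{max}$ denotes $\min[\max(X, 0), P^{max}]$.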
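Appendix C: Solution to AF Orthogonal Using KKT Conditions

It is easy to see that the first constraint in (16) must be met
with equality. The Lagrangian is given by:

\[
\mathcal{L} = \sum_{i \in \mathcal{R}} (X_i + V\beta_i) P_i - \sum_{i \in \mathcal{R}} \lambda_i P_i + \sum_{i \in \mathcal{R}} \mu_i (P_i - P_i^{max})
+ \nu \Big[\sum_{i \in \mathcal{R}} \frac{P_s^2 |h_{si}|^4 + P_s |h_{si}|^2 W/m}{|h_{si}|^2 P_s + |h_{id}|^2 P_i + W/m} - \theta'\Big]
\]

where $\lambda_i$ and $\mu_i$ denote the multipliers of the lower and upper power limits respectively.
The KKT conditions for all $i \in \mathcal{R}$ are:

\[
\begin{aligned}
& \lambda_i^* P_i^* = 0, \quad \mu_i^* (P_i^* - P_i^{max}) = 0, \quad \lambda_i^*, \mu_i^* \geq 0 \\
& (X_i + V\beta_i) - \lambda_i^* + \mu_i^* = \frac{\nu^* |h_{id}|^2 \left(P_s^2 |h_{si}|^4 + P_s |h_{si}|^2 W/m\right)}{\left(|h_{si}|^2 P_s + |h_{id}|^2 P_i^* + W/m\right)^2}
\end{aligned}
\]

If $\nu^* \leq 0$, then we must have that $\lambda_i^* - \mu_i^* > 0$ for all $i$. This
would mean that $P_i^* = 0$. For some $\nu^* > 0$, we have three
cases:

1) If $\lambda_i^* = \mu_i^*$, we get $P_i^* = \sqrt{\frac{\nu^* \left(P_s^2 |h_{si}|^4 + P_s |h_{si}|^2 W/m\right)}{(X_i + V\beta_i) |h_{id}|^2}} - \frac{P_s |h_{si}|^2 + W/m}{|h_{id}|^2}$

2) If $\lambda_i^* > \mu_i^*$, then we must have $\lambda_i^* > 0$ and we get
$P_i^* = 0$

3) If $\lambda_i^* < \mu_i^*$, then we must have $\mu_i^* > 0$ and we get
$P_i^* = P_i^{max}$

Combining these, we get:

\[
P_i^* = \left[\sqrt{\frac{\nu^* \left(P_s^2 |h_{si}|^4 + P_s |h_{si}|^2 W/m\right)}{(X_i + V\beta_i) |h_{id}|^2}} - \frac{P_s |h_{si}|^2 + W/m}{|h_{id}|^2}\right]_0^{P_i^{max}}
\]

where $[X]_0^{max}$ denotes $\min[\max(X, 0), P^{max}]$.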